Multitaper MFCC Features for Acoustic Stress Recognition from Speech
Abstract
Improving the performance of speech recognition systems is a challenging problem that continues to attract research interest. In this paper, we compare two methods for extracting Mel Frequency Cepstral Coefficients (MFCCs) to represent stressed speech utterances, with the aim of obtaining the best recognition performance. The first, traditional method is based on a single window (taper), generally the Hamming window; the second is a novel technique that uses multiple tapers instead of a single one. The extracted features are then classified using multiclass Support Vector Machines. Experimental results on the SUSAS database show that the multitaper MFCC features outperform the conventional MFCCs.
Keywords: Mel Frequency Cepstral Coefficients (MFCC); Multitapering; Multiclass SVM; Stress recognition
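To make the difference between the two front ends concrete, the following is a minimal sketch of the spectrum-estimation step only: a single Hamming-windowed periodogram versus an average of several tapered periodograms. The abstract does not say which taper family the authors use, so the sine tapers here are an assumption chosen because they have a closed form; the function names, the frame length, the number of tapers, and the uniform weights are likewise illustrative. The naive DFT keeps the sketch dependency-free, and an FFT routine would normally replace it. The remaining MFCC steps (mel filterbank, logarithm, DCT) would be the same for both estimates and are omitted.

```python
import cmath
import math
import random


def hamming(n_len):
    """Conventional single taper: the Hamming window."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_len - 1))
            for n in range(n_len)]


def sine_tapers(n_len, k_tapers):
    """Orthonormal sine tapers (assumed here; the paper may use another family)."""
    scale = math.sqrt(2.0 / (n_len + 1))
    return [[scale * math.sin(math.pi * k * (n + 1) / (n_len + 1))
             for n in range(n_len)]
            for k in range(1, k_tapers + 1)]


def power_spectrum(frame, taper):
    """Squared magnitude of the DFT of the tapered frame, bins 0..N/2."""
    n_len = len(frame)
    x = [s * w for s, w in zip(frame, taper)]
    spectrum = []
    for b in range(n_len // 2 + 1):
        acc = sum(x[n] * cmath.exp(-2j * math.pi * b * n / n_len)
                  for n in range(n_len))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def multitaper_spectrum(frame, tapers, weights=None):
    """Weighted average of the per-taper power spectra (uniform by default)."""
    weights = weights or [1.0 / len(tapers)] * len(tapers)
    spectra = [power_spectrum(frame, t) for t in tapers]
    return [sum(w * s[b] for w, s in zip(weights, spectra))
            for b in range(len(spectra[0]))]


def relative_variance(spectrum):
    """Variance across bins divided by the squared mean (scale-free)."""
    mean = sum(spectrum) / len(spectrum)
    var = sum((v - mean) ** 2 for v in spectrum) / len(spectrum)
    return var / mean ** 2


# Demo on one white-noise frame, whose true spectrum is flat, so any
# spread across bins reflects estimator variance rather than signal shape.
rng = random.Random(0)
frame = [rng.gauss(0.0, 1.0) for _ in range(64)]
single = power_spectrum(frame, hamming(64))
multi = multitaper_spectrum(frame, sine_tapers(64, 6))
print("single-taper relative variance:", relative_variance(single))
print("multitaper relative variance:  ", relative_variance(multi))
```

Averaged over many noise frames, the multitaper estimate should show a clearly smaller spread across bins than the single-taper one. This lower variance is the usual motivation for multitaper estimation; whether it improves stress recognition is what the paper's experiments address.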
Similar resources
Comparison of Parameterization Methods in Recognizing Spoken Arabic Digits
This paper evaluates sound parameterization methods for recognizing some spoken Arabic words, namely the digits zero to nine. Each isolated spoken word is represented by a single template based on a specific recognition feature, and recognition is based on the Euclidean distance from those templates. The performance analysis of recognition is based on four parameterization feat...
Multitaper MFCC and PLP features for speaker verification using i-vectors
In this paper we study the performance of the low-variance multi-taper Mel-frequency cepstral coefficient (MFCC) and perceptual linear prediction (PLP) features in a state-of-the-art i-vector speaker verification system. The MFCC and PLP features are usually computed from a Hamming-windowed periodogram spectrum estimate. Such a single-tapered spectrum estimate has large variance, which can be red...
A Study of Low-variance Multi-taper Features for Distributed Speech Recognition
In this paper we study low-variance multi-taper spectrum estimation methods to compute the mel-frequency cepstral coefficient (MFCC) features for robust speech recognition. In speech recognition, MFCC features are usually computed from a Hamming-windowed DFT spectrum. Although windowing helps reduce the bias of the spectrum, the variance remains high. Multitaper spectrum estimation methods...
Analysis and prediction of acoustic speech features from mel-frequency cepstral coefficients in distributed speech recognition architectures.
The aim of this work is to develop methods that enable acoustic speech features to be predicted from mel-frequency cepstral coefficient (MFCC) vectors as may be encountered in distributed speech recognition architectures. The work begins with a detailed analysis of the multiple correlation between acoustic speech features and MFCC vectors. This confirms the existence of correlation, which is fo...
Improving the performance of MFCC for Persian robust speech recognition
Mel Frequency Cepstral Coefficients are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...